ABSTRACT


The  Data  Management  System  provides  the  lowest  level  support  for  the 
ILLIAC  IV  Information  Management  and  Analysis  System.   The  design  of  the  Data 
Management System is based on an experimental technique that is algorithmically
simple but tied significantly to the ILLIAC IV's computational power. Several
"tuning"  requirements  have  evolved  because  of  the  nature  of  the  system  design. 
This  paper  presents  a  Queueing  Theory  Model  of  the  system.  The  solution  of 
the  model  provides  the  values  for  the  tuning  parameters. 
1.   A BRIEF DESCRIPTION OF THE DATA MANAGEMENT SYSTEM

The Data Management System (DMS) is one subsystem of the ILLIAC IV
Information  Management  and  Analysis  System.  The  details  may  be  found  in 
reference  [3].  A  brief  description  of  the  DMS  follows. 

The  DMS  provides  the  lowest  level  support  for  the  Information 
Management  System  by  performing  two  tasks.  First,  it  provides  an  easily  usable 
general  purpose  file  acquisition  system  to  bring  files  from  the  archival  laser 
memory  to  the  ILLIAC  IV  disk.   Second,  retrieval  structures  called  key  elements 
provide  search  arguments  which  are  matched  according  to  some  relation  against 
other key elements in order to retrieve a subset of records from a file
residing on the disk. The retrieved records are examined further by another
subsystem.  This  paper  is  only  concerned  with  the  problems  of  retrieving 
records  from  files  on  the  disk  incurred  as  a  result  of  the  design  of  the  DMS. 

The designers of the DMS have relied heavily upon the ILLIAC IV
processing power and incredibly high I/O transfer rate (about 400 times faster
than an IBM 2314 disk pack) to create a simple DMS which has a high access
rate but also minimizes queueing of I/O requests.

Other multi-key access systems maintain extensive key tables that
must be input, examined, and merged to arrive at a list of addresses where
records reside [4]. A series of I/O requests is then made to retrieve these
records. Figure 1 is a diagram of this process.

The  ILLIAC  IV  DMS,  on  the  other  hand,  merges  the  keys  and  the  records 
together  for  any  file.  Keys  that  identify  a  particular  record  reside  on  the 
disk  in  front  of  the  record.  When  a  record  is  to  be  found,  the  keys  are 
specified  and  in  one  I/O  request  the  entire  file  is  flushed  through  the 
ILLIAC IV processors by cyclically filling up buffers in memory. As long as
the processor keeps ahead of the transfer rate, all the records can be found
for a request in the time it takes to pass the file. Since the I/O rate is
so  fast,  this  system  responds  well  ahead  of  other  systems,  yet  the  coding 
requirements  are  very  small.   Figure  2  is  a  schematic  of  the  process. 
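
The single-pass scan described above can be sketched roughly as follows. This is only an illustration in Python, not the DMS code: the `Record` layout, the `matches` predicate, the sample file, and the assumption of one record per buffer row are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List


@dataclass
class Record:
    keys: Dict[str, int]   # key elements stored in front of the record (assumed layout)
    body: str              # the record payload


def buffers(file_records: Iterable[Record], rows_per_buffer: int) -> Iterator[List[Record]]:
    """Yield the file as consecutive fixed-size buffer loads."""
    load: List[Record] = []
    for rec in file_records:
        load.append(rec)
        if len(load) == rows_per_buffer:
            yield load
            load = []
    if load:
        yield load


def scan(file_records: Iterable[Record],
         matches: Callable[[Dict[str, int]], bool],
         rows_per_buffer: int = 4) -> List[Record]:
    """Pass the whole file once, keeping records whose keys satisfy the search."""
    hits: List[Record] = []
    for load in buffers(file_records, rows_per_buffer):
        hits.extend(rec for rec in load if matches(rec.keys))
    return hits


# Hypothetical ten-record file and a two-key search argument.
disk_file = [Record({"year": 1965 + i, "site": i % 3}, f"record-{i}") for i in range(10)]
found = scan(disk_file, lambda k: k["year"] >= 1970 and k["site"] == 0)
print([r.body for r in found])
```

No key table is consulted here: every record is examined exactly once, so the cost of a request is the cost of passing the file, whatever the search argument.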

Limited  core  space  has  been  a  problem  for  all  the  other  systems 
developed  for  the  ILLIAC  IV;  this  system  anticipates  the  same  problem. 
To keep the DMS simple we have decided not to provide a dynamic storage
allocation scheme for incoming buffer loads of records and keys. The problem then
is to determine how many fixed-size buffers to maintain in memory and what size
they should be. We must provide enough space so that the fixed-size queue of
buffers is rarely full. If the queue becomes full then the I/O must stop
while the processor catches up, and buffers cannot be transferred again until
the disk makes a complete revolution. If I/O stops too often, the system
performance will deteriorate beyond an acceptable level.

Sections  2  and  3  utilize  results  from  Queueing  Theory  to  establish 
the  number  and  size  of  input  buffers  to  be  maintained  in  core. 


2.   THE  QUEUEING  MODEL 


An  abstraction  of  the  Data  Management  System  is  shown  in  Figure  3. 
Since the I/O rate is constant, the interarrival time $T$ is constant and it
is proportional to the size of the buffer. The larger the buffer, the slower
the arrival of a buffer. If we let $\lambda$ be the arrival rate of one row to
ILLIAC IV memory, then $T = k/\lambda$ where $k$ is the number of rows in each buffer.
Therefore, the cumulative interarrival time distribution is given by:


\[
A(t) =
\begin{cases}
0 & \text{for } t < T, \\
1 & \text{for } t \ge T,
\end{cases}
\]

and

\[
T = k/\lambda .
\]


However, the time required to service or process a buffer depends
on the complexity of the buffer and the complexity of the search being
performed. The greater the number of keys associated with each record, the
more processing time is needed to determine whether a particular record is to
be retrieved. The greater the number of keys upon which various relations must
hold, as determined by the complexity of the search, the longer the processing
time required to retrieve records. Since these quantities are unknown prior to a
system request, the service times can be described by a probability
distribution. Until the system has been built, the service time distribution
cannot be determined. Until such knowledge can be obtained, we will assume
that the service times are distributed according to the exponential density
function with mean service rate $S$ (buffers/unit time). The mean service time,
$1/S$, is also proportional to the size of the buffer. Therefore, until the precise
relationship can be determined, we will assume that $S = \mu/k$ where $k$ is the
number of rows in each buffer and $\mu$ is the average service rate per row.
Finally, the service time distribution is given by the following density
function:

\[
b(t) = S e^{-St}
\]

and

\[
S = \mu/k .
\]
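
As a quick check on these definitions, the sketch below computes the interarrival time, the service rate, and the ratio of mean service time to interarrival time (the traffic intensity, in standard queueing terminology) for a few buffer sizes. The numerical rates are the rough estimates quoted later in this paper; the function name and the chosen values of k are illustrative assumptions.

```python
def buffer_rates(k: int, lam: float, mu: float):
    """Return (T, S, rho) for k rows per buffer.

    lam: row arrival rate (rows/usec); mu: row service rate (rows/usec).
    """
    T = k / lam          # constant interarrival time of a buffer (usec)
    S = mu / k           # mean service rate (buffers/usec)
    rho = 1.0 / (S * T)  # mean service time divided by interarrival time
    return T, S, rho


for k in (1, 8, 64):
    T, S, rho = buffer_rates(k, lam=0.25, mu=0.33)
    print(f"k={k:3d}  T={T:7.1f} usec  S={S:.5f} buffers/usec  rho={rho:.3f}")
```

Because both T and 1/S scale with k, the ratio reduces to lambda/mu for every buffer size; the processor keeps ahead of the transfer on average only when that ratio is below one.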

The  arrival  population  is  finite  but  very  large;  all  processed 
buffers are returned to the arrival population so we presume that the input to
the  processor  is  infinite.   The  queue  discipline  is  first-come-first-served. 

Reiterating,  the  problem  is  to  determine,  in  some  optimum  sense,  the 
minimum number of buffer spaces needed in core and the size of these buffers.
That is, we want to restrict the length of the queue to save core space, but
reasonably minimize the possibilities of having to reject an incoming buffer
because the queue is full. If this were to happen, the disk would have to
make a full rotation before the buffer could be resubmitted. This would
cause a serious time delay and performance would deteriorate if this incident
occurred regularly.

To  determine  the  size  of  the  restricted  queue,  we  will  establish 
the  probability  of  waiting  in  the  queue  longer  than  time  t.   We  will  then 
determine the probabilities of waiting longer than integral multiples of the
average service time. By choosing an integer, n, which makes the probability
of waiting longer than n expected service times small, we will have good assurance
that such a fixed integer number of buffers will not cause I/O to stop.


3.   THE SOLUTION

Saaty [1] presents Lindley's derivation and results for the waiting-time
(in queue) distribution for a general independent input distribution and
general independent service time distribution for a single channel,
first-come-first-served queue.

By letting $t_n$ be the arrival interval between the $n$th and $(n+1)$st
unit and $s_n$ denote the service time of the $n$th unit, the waiting time of the
$(n+1)$st unit is established as

\[
W_{n+1} =
\begin{cases}
W_n + U_n, & \text{if } W_n + U_n > 0, \\
0, & \text{if } W_n + U_n \le 0,
\end{cases}
\]

where $U_n = s_n - t_n$. $U(w)$ is the cumulative distribution of $U_n$, given by:

\[
U(w) = 1 - \int_{0}^{\infty} b(y + w)\, A(y)\, dy
\]

where $A(y)$ is the cumulative interarrival-time distribution and $b(y)$ is the
service time density function.
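
The waiting-time recursion can also be iterated directly, which gives a numerical cross-check on any closed-form result. The following is a minimal simulation sketch, assuming the constant interarrival time and exponential service times of Section 2; the rates are the rough estimates quoted later in this paper, and the buffer size, sample count, and seed are arbitrary illustrative choices.

```python
import random


def simulate_waits(k: int, lam: float, mu: float, n_buffers: int, seed: int = 1):
    """Iterate W_{n+1} = max(0, W_n + s_n - T) for constant T and exponential s_n."""
    rng = random.Random(seed)
    T = k / lam                      # constant interarrival time
    S = mu / k                       # mean service rate
    w, waits = 0.0, []
    for _ in range(n_buffers):
        waits.append(w)
        s = rng.expovariate(S)       # service time of the current buffer
        w = max(0.0, w + s - T)      # waiting time of the next buffer
    return waits


waits = simulate_waits(k=8, lam=0.25, mu=0.33, n_buffers=200_000)
mean_wait = sum(waits) / len(waits)
frac_no_wait = sum(1 for w in waits if w == 0.0) / len(waits)
print(f"mean wait = {mean_wait:.1f} usec, fraction not waiting = {frac_no_wait:.3f}")
```

The two printed statistics are the empirical counterparts of the expected waiting time and of the probability that an arriving buffer does not wait at all.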

By substituting our functions for $A(y)$ and $b(y)$, it was shown [1]
that the probability, $P(<t)$, of waiting in the queue less than $t$ is given by:

\[
P(<t) = 1 - (1 - p_0)\, e^{-S p_0 t}
\]

and the expected waiting time is

\[
W_q = \int_{0}^{\infty} t \, dP(<t) = \frac{1 - p_0}{S p_0}
\]

where $p_0$, the probability of the system being empty, is the nonzero root of
$e^{-S p_0 T} = 1 - p_0$.

Reviewing,

    $k$ = number of rows/buffer;
    $S = \mu/k$; and
    $T = k/\lambda$.


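To find $p_0$ we solve

\[
e^{-S T p_0} = 1 - p_0 \tag{1}
\]

which, since $ST = \mu/\lambda$, becomes

\[
e^{-\frac{\mu}{\lambda} p_0} = 1 - p_0 .
\]

Let the solution to this equation be

\[
p_0 = c_1 .
\]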
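
Equation (1) has no closed-form solution, but its nonzero root is easy to obtain numerically. The sketch below uses plain bisection; the function name and tolerance are illustrative assumptions, and the rates passed in are the rough estimates quoted later in this paper. With those estimates the root comes out close to the value of 0.45 that the paper reads from tables.

```python
import math


def solve_p0(mu: float, lam: float, tol: float = 1e-12) -> float:
    """Nonzero root of exp(-(mu/lam) * p) = 1 - p on (0, 1), by bisection."""
    r = mu / lam
    if r <= 1.0:
        raise ValueError("a nonzero root in (0, 1) requires mu > lam")

    def f(p: float) -> float:
        return math.exp(-r * p) - (1.0 - p)

    lo, hi = 1e-9, 1.0        # f(lo) < 0 when r > 1, and f(1) = exp(-r) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


c1 = solve_p0(mu=0.33, lam=0.25)
print(f"c1 = {c1:.3f}")
```

Bisection is used here only because it is simple and cannot diverge: the left side of the equation minus the right side is convex in p, vanishes at zero, and changes sign exactly once on the open interval.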

The expected waiting time is

\[
W_q = \frac{1 - p_0}{S p_0} = \frac{(1 - p_0)\,k}{\mu p_0} = c_2 k, \qquad \text{where } c_2 = \frac{1 - c_1}{\mu c_1} .
\]

The  waiting  time  is  a  function  of  the  buffer  size,  k,  and  can  be  minimized  by 
reducing  the  buffer  size  to  one.   However,  this  would  not  be  optimal  for  disk 
storage  blocking  so  this  result  needs  to  be  examined  further  within  the  larger 
framework of the whole I/O system.
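
As a rough illustration of the scale involved, using the estimates quoted later in this paper ($\mu \approx 0.33$ rows/$\mu$sec and $c_1 = 0.45$), the constant evaluates to

\[
c_2 = \frac{1 - c_1}{\mu c_1} = \frac{0.55}{0.33 \times 0.45} \approx 3.7 \ \mu\text{sec per row},
\]

so that $W_q \approx 3.7\,k$ $\mu$sec: about 3.7 $\mu$sec for $k = 1$ and about 59 $\mu$sec for $k = 16$. The value $k = 16$ is an arbitrary choice made only for this example.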

Since $1/S$ is the average processing time, we evaluate the following
expression at $t = n/S$ to determine the probability of waiting in the queue
longer than $n/S$, $n = 1, 2, \ldots$:

\[
\begin{aligned}
P(<t) &= 1 - (1 - p_0)\, e^{-S p_0 t} \\
      &= 1 - (1 - p_0)\, e^{-\frac{\mu}{k} p_0 n \frac{k}{\mu}} \\
      &= 1 - (1 - p_0)\, e^{-n p_0} \\
      &= 1 - (1 - c_1)\, e^{-n c_1} .
\end{aligned}
\]

Note that $P(<t)$, $t = n/S$, $n = 1, 2, \ldots$, is independent of $k$, the buffer size.

\[
\begin{aligned}
P(>t) &= 1 - P(<t) \\
      &= (1 - c_1)\, e^{-n c_1} .
\end{aligned}
\tag{2}
\]


To determine the probability that an arriving buffer will wait longer than
n average processing time units, we need to evaluate equation (2). We would
then choose the smallest n for which this probability is acceptably small as
the number of buffers needed to give reasonable assurance of not stopping I/O.
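
Equation (2) can be inverted to give the buffer count directly: requiring $(1 - c_1)e^{-n c_1} \le \epsilon$ gives $n \ge \ln((1 - c_1)/\epsilon)/c_1$. The sketch below applies that bound. The tolerance `eps` and the function name are assumptions introduced for illustration, and the value of `c1` is the one used in the numerical example of this paper.

```python
import math


def buffers_needed(c1: float, eps: float) -> int:
    """Smallest integer n >= 0 with (1 - c1) * exp(-n * c1) <= eps."""
    if not (0.0 < c1 < 1.0 and 0.0 < eps < 1.0):
        raise ValueError("c1 and eps must lie strictly between 0 and 1")
    n = math.ceil(math.log((1.0 - c1) / eps) / c1)
    return max(n, 0)


for eps in (0.10, 0.05, 0.01):
    print(f"eps = {eps:.2f}  ->  n = {buffers_needed(c1=0.45, eps=eps)}")
```

Because the tail probability decays geometrically in n, tightening the tolerance by a factor of ten costs only about $\ln(10)/c_1$ additional buffers.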

To provide a numerical example, a rough estimate of $\mu$ (which must be
precisely determined later) has been arrived at as

\[
\mu \approx 0.33 \ \text{rows/}\mu\text{sec}.
\]

Since the I/O rate is $10^9$ bits/sec and there are approximately $4 \times 10^3$
bits/row,

\[
\lambda \approx 0.25 \ \text{rows/}\mu\text{sec}.
\]
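
Written out, the arithmetic behind the arrival rate, together with the ratio that enters equation (1), is:

\[
\lambda = \frac{10^{9}\ \text{bits/sec}}{4 \times 10^{3}\ \text{bits/row}}
        = 2.5 \times 10^{5}\ \text{rows/sec}
        = 0.25\ \text{rows/}\mu\text{sec},
\qquad
\frac{\mu}{\lambda} \approx \frac{0.33}{0.25} = 1.32 .
\]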
To find $c_1$ we solve equation (1), giving

\[
c_1 = p_0 = 0.45
\]

by examination of tables [2]. We evaluate equation (2) to establish the following
table of the probability of waiting in the queue longer than $n/S$,
$n = 1, 2, \ldots$:

    n      t      P(< t)     P(> t) = 1 - P(< t)

    0      0       .45        .55
    1      1/S     .65        .35
    2      2/S     .78        .22
    3      3/S     .86        .14
    4      4/S     .91        .09
    5      5/S     .94        .06
    6      6/S     .97        .03
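
The table can be regenerated from equation (2) with a few lines of code. This is a sketch for checking the arithmetic, assuming the quoted value of 0.45 for the root of equation (1); the computed probabilities agree with the tabulated ones to within about 0.01.

```python
import math

c1 = 0.45  # value of p0 quoted in the text

print(" n    t     P(<t)   P(>t)")
rows = []
for n in range(7):
    p_longer = (1.0 - c1) * math.exp(-n * c1)   # equation (2) at t = n/S
    rows.append((n, 1.0 - p_longer, p_longer))
    t_label = "0" if n == 0 else f"{n}/S"
    print(f"{n:2d}   {t_label:4s}  {1.0 - p_longer:.3f}   {p_longer:.3f}")
```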

Therefore, the probability that an arriving buffer will wait longer
than 6 average processing time units is only 0.03. Thus, under our assumptions
and with our estimates, if we provided 6 buffer spaces in core and used a
cyclic buffer technique for incoming buffers, we have reasonable assurance of
not having to stop I/O much of the time due to a full queue.




REFERENCES 

[1]   Saaty,  Elements  of  Queueing  Theory,  pp.  209-211. 

[2]   Standard  Mathematical  Tables,  Chemical  Rubber  Co.,  12th  Edition. 

[3]   Schuster,  Stewart  A.   "An  Information  Management  and  Analysis  System  for 
ILLIAC  IV",  CAC  Document  No.  1.  Urbana,  Illinois:   Center  for  Advanced 
Computation,  University  of  Illinois  at  Urbana-Champaign,  (December  11, 
1970). 

[4]   Dodd, G.  "Elements of Data Management Systems", Computing Surveys,
Vol. 1, No. 2, June 1969.


UNCLASSIFIED
Security Classification

DOCUMENT CONTROL DATA - R&D

(Security classification of title, body of abstract and indexing annotation must be entered when the overall report is classified)

1.  ORIGINATING ACTIVITY (Corporate author)
    Center for Advanced Computation
    University of Illinois at Urbana-Champaign
    Urbana, Illinois 61801

2a. REPORT SECURITY CLASSIFICATION
    UNCLASSIFIED

2b. GROUP

3.  REPORT TITLE
    The Tuning of Buffer Parameters for the ILLIAC IV Data Management System

4.  DESCRIPTIVE NOTES (Type of report and inclusive dates)
    Research Report

5.  AUTHOR(S) (First name, middle initial, last name)
    Stewart A. Schuster

6.  REPORT DATE
    January 15, 1971

7a. TOTAL NO. OF PAGES
    15

7b. NO. OF REFS
    4

8a. CONTRACT OR GRANT NO.
    USAF 30(602)-4144

8b. PROJECT NO.
    ARPA Order 788

9a. ORIGINATOR'S REPORT NUMBER(S)
    CAC Document No. 3

9b. OTHER REPORT NO(S) (Any other numbers that may be assigned this report)

10. DISTRIBUTION STATEMENT
    Copies may be requested from the address given in (1) above.

11. SUPPLEMENTARY NOTES
    None

12. SPONSORING MILITARY ACTIVITY
    Rome Air Development Center
    Griffiss Air Force Base
    Rome, New York 13440

